105 research outputs found

Experiments on applying relaxation labeling to map multilingual hierarchies

Author: Daude Ventura Jordi
Padró Lluís
Rigau Claramunt German
Publication date: 01/01/1999

This paper explores the automatic construction of a multilingual Lexical Knowledge Base from preexisting lexical resources and presents a new approach for linking existing hierarchies. The relaxation labeling algorithm is used to select, among all the candidate connections proposed by a bilingual dictionary, the right connection for each node in the taxonomy.

Postprint (published version)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Multilingual knowledge resources for wide–coverage semantic processing

Author: Cuadros Oller Montserrat
Rigau Claramunt German
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural
Publication date: 01/01/2008

This report presents a wide survey of publicly available multilingual knowledge resources that could be of interest for wide-coverage semantic processing tasks. The study covers a wide range of manually and automatically derived large-scale knowledge resources for English and Spanish, with the aim of giving a clear picture of their current state. It also includes an empirical evaluation, in a multilingual scenario, of the relative quality of some of these resources. To establish a fair and neutral comparison, the quality of each knowledge resource is evaluated indirectly using the same method on a Word Sense Disambiguation (WSD) task, the Senseval-3 English Lexical Sample task.

This work was partially funded by the IXA group of the UPV/EHU and the projects KNOW (TIN2006-15049-C03-01) and ADIMEN (EHU06/113).

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Multilingual evaluation of KnowNet

Author: Cuadros Oller Montserrat
Rigau Claramunt German
Publication date: 01/01/2008

This paper presents a new, fully automatic method for building highly dense and accurate knowledge bases from existing semantic resources. The method uses a wide-coverage and accurate knowledge-based Word Sense Disambiguation algorithm to assign the most appropriate senses to large sets of topically related words acquired from the web. KnowNet, the resulting knowledge base, connects large sets of semantically related concepts and is a major step towards the autonomous acquisition of knowledge from raw corpora. KnowNet is several times larger than any available knowledge resource encoding relations between synsets, and the knowledge it contains outperforms any other resource when empirically evaluated in a common multilingual framework.

Peer reviewed. Postprint (published version)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

SemEval-2007 Task 16: evaluation of wide coverage knowledge resources

Author: Cuadros Oller Montserrat
Rigau Claramunt German
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2007

This task tries to establish the relative quality of available semantic resources (derived by manual or automatic means). The quality of each large-scale knowledge resource is evaluated indirectly on a Word Sense Disambiguation task. In particular, we use the Senseval-3 and SemEval-2007 English Lexical Sample tasks as evaluation benchmarks. Furthermore, to be as neutral as possible with respect to the knowledge bases studied, we systematically apply the same disambiguation method to all the resources. Completely different behaviour is observed on the two lexical data sets (Senseval-3 and SemEval-2007).

Peer reviewed. Postprint (author's final draft)

UPCommons. Portal del coneixement obert de la UPC

KnowNet: A proposal for building highly connected and dense knowledge bases from the web

Author: Cuadros Oller Montserrat
Rigau Claramunt German
Publication date: 01/01/2008

This paper presents a new, fully automatic method for building highly dense and accurate knowledge bases from existing semantic resources. The method uses a wide-coverage and accurate knowledge-based Word Sense Disambiguation algorithm to assign the most appropriate senses to large sets of topically related words acquired from the web. KnowNet, the resulting knowledge base, connects large sets of semantically related concepts and is a major step towards the autonomous acquisition of knowledge from raw corpora. KnowNet is several times larger than any available knowledge resource encoding relations between synsets, and the knowledge it contains outperforms any other resource when empirically evaluated in a common multilingual framework.

Peer reviewed. Preprint (author's version)

CiteSeerX

UPCommons. Portal del coneixement obert de la UPC

Highlighting relevant concepts from Topic Signatures

Author: Cuadros Oller Montserrat
Padró Lluís
Rigau Claramunt German
Publication date: 01/01/2012

This paper presents deepKnowNet, a new, fully automatic method for building highly dense and accurate knowledge bases from existing semantic resources. The method applies a knowledge-based Word Sense Disambiguation algorithm to assign the most appropriate WordNet sense to large sets of topically related words acquired from the web, named TSWEB. This Word Sense Disambiguation algorithm is the personalized PageRank algorithm implemented in UKB. The new method automatically improves the current content of WordNet by creating large volumes of new and accurate semantic relations between synsets. KnowNet was our first attempt at acquiring large volumes of semantic relations, but it had some limitations that deepKnowNet overcomes. deepKnowNet disambiguates the first hundred words of all Topic Signatures from the web (TSWEB). In doing so, the method highlights the most relevant word senses of each Topic Signature and filters out those that are less related to the topic. The knowledge it contains outperforms any other resource when empirically evaluated in a common framework based on a similarity task annotated with human judgements.

Postprint (published version)
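As an illustration of the personalized PageRank idea mentioned in this abstract, the following is a minimal sketch, not UKB's implementation. It assumes a tiny hand-made sense graph; the node names, the `personalized_pagerank` function, and the damping and iteration settings are all hypothetical choices for the example. Teleport probability is concentrated on the senses of the context words, and the candidate sense with the highest resulting rank is selected.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power iteration for PageRank with teleport mass restricted to `seeds`.

    graph: dict mapping each node to a list of neighbour nodes (toy data).
    seeds: set of nodes that receive all of the teleport probability.
    """
    nodes = list(graph)
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)  # start from the seed distribution
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for node in nodes:
            neighbours = graph[node]
            if not neighbours:
                # Dangling node: hand its mass back to the seed distribution.
                for n in nodes:
                    new_rank[n] += damping * rank[node] * teleport[n]
                continue
            share = damping * rank[node] / len(neighbours)
            for nb in neighbours:
                new_rank[nb] += share
        rank = new_rank
    return rank


# Hypothetical sense graph: two senses of "bank" and a few related senses.
graph = {
    "bank#finance": ["money#1", "loan#1"],
    "bank#river": ["water#1"],
    "money#1": ["bank#finance", "loan#1"],
    "loan#1": ["bank#finance", "money#1"],
    "water#1": ["bank#river"],
}
# Context words "money" and "loan" act as the personalization seeds.
rank = personalized_pagerank(graph, seeds={"money#1", "loan#1"})
best_sense = max(["bank#finance", "bank#river"], key=rank.get)
```

In this toy graph the financial sense of "bank" is selected because it is the only candidate connected to the seed senses; a real system would run over the full WordNet relation graph.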

UPCommons. Portal del coneixement obert de la UPC

A Proposal for word sense disambiguation using conceptual distance

Author: Agirre Eneko
Rigau Claramunt German
Publication date: 01/01/1996

This paper presents a method for the resolution of lexical ambiguity and its automatic evaluation over the Brown Corpus. The method relies on the wide-coverage noun taxonomy of WordNet and the notion of conceptual distance among concepts, captured by a Conceptual Density formula developed for this purpose. This fully automatic method requires no hand coding of lexical entries, no hand tagging of text, and no training process of any kind. The results of the experiment have been automatically evaluated against SemCor, the sense-tagged version of the Brown Corpus.

Postprint (published version)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

The MEANING Project

Author: Agirre Bengoa Eneko
Atserias Batalla Jordi
Rigau Claramunt German
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural
Publication date: 01/01/2003

Progress is being made in Natural Language Processing (NLP), but there is still a long way to go towards Natural Language Understanding. An important step towards this goal is the development of technologies and resources that deal with concepts rather than words. However, to build the next generation of intelligent open-domain Human Language Technology (HLT) application systems, we need to solve two complementary and intermediate tasks: Word Sense Disambiguation (WSD) and automatic large-scale enrichment of Lexical Knowledge Bases.

The MEANING Project is funded by the EU 5th Framework IST Programme.

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Asignación automática de etiquetas de dominios en WordNet

Author: Castillo Valdés Mauro
Real Vázquez Francis
Rigau Claramunt German
Publication venue: Sociedad Española para el Procesamiento del Lenguaje Natural
Publication date: 01/01/2003

This paper describes a process for automatically assigning WordNet domain labels to WordNet glosses. One of the main goals of this work is to enrich lexical sources with WordNet information. WordNet Domains are used as the knowledge source. Finally, domain labels for the noun and verb parts of WordNet are suggested and verified.

This work was partially funded by the European Commission (MEANING IST-2001-34460), the Generalitat de Catalunya (2002FI 00648), and Universidad Tecnológica Metropolitana, Chile.

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Secretaría de Estado de Cultura

Exploring the automatic selection of basic level concepts

Author: Izquierdo Beviá Rubén
Rigau Claramunt German
Suárez Cueto Armando
Publication venue: INCOMA
Publication date: 01/01/2007

We present a very simple method for selecting Base Level Concepts using basic structural properties of WordNet. We also empirically demonstrate that these automatically derived sets of Base Level Concepts group senses into an adequate level of abstraction for class-based Word Sense Disambiguation. In fact, a very naive Most Frequent classifier using the selected classes is able to perform semantic tagging with accuracy figures over 75%.

Supported by the European Union under the QALL-ME project (FP6 IST-033860) and by the Spanish Government under the Text-Mess (TIN2006-15265-C06-01) and KNOW (TIN2006-15049-C03-01) projects.
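The "Most Frequent" baseline mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the words, the class labels, and the helper names `train_most_frequent` and `accuracy` are hypothetical, and the toy data does not come from the paper or reproduce its Base Level Concepts. Each word is tagged with the semantic class it carried most often in the training data.

```python
from collections import Counter, defaultdict


def train_most_frequent(tagged):
    """Map each word to the class it is most often tagged with in `tagged`."""
    counts = defaultdict(Counter)
    for word, cls in tagged:
        counts[word][cls] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def accuracy(model, tagged):
    """Fraction of (word, class) pairs for which the model predicts the class."""
    correct = sum(1 for word, cls in tagged if model.get(word) == cls)
    return correct / len(tagged)


# Hypothetical training and test data: (word, semantic class) pairs.
train = [
    ("bank", "institution"), ("bank", "institution"), ("bank", "land"),
    ("plant", "organism"), ("plant", "artifact"), ("plant", "organism"),
]
test = [
    ("bank", "institution"), ("plant", "artifact"),
    ("plant", "organism"), ("bank", "land"),
]

model = train_most_frequent(train)
score = accuracy(model, test)
```

The baseline needs no context features at all, which is why coarser classes (fewer, more general labels per word) tend to raise its accuracy.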

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas